Binary mask estimation based on frequency modulations

نویسندگان

  • Chung-Chien Hsu
  • Jen-Tzung Chien
  • Tai-Shih Chi
چکیده

In this paper, a binary mask estimation algorithm is proposed based on modulations of speech. A multi-resolution spectrotemporal analytical auditory model is utilized to extract modulation features to estimate the binary mask, which is often used in speech segregation applications. The proposed method estimates noise from the beginning of each test sentence, a common approach seen in many conventional speech enhancement algorithms, to further enhance the modulation features. Experimental results demonstrate that the proposed method outperforms the AMS-GMM system in terms of the HIT-FA rate when estimating the binary mask.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Techniques for Estimating the Ideal Binary Mask

This paper provides a comparison of binary mask estimation techniques, based on different ways of estimating the instantaneous SNR. The effect of six different gain functions and three noise estimation algorithms on estimating the SNR, and subsequently the binary mask was assessed. New criteria are proposed for classifying time-frequency bins as belonging to the target or masker signals. Senten...

متن کامل

Asr-driven Binary Mask Estimation for Robust Automatic Speech Recognition

Additive noise has long been an issue for robust automatic speech recognition (ASR) systems. One approach to noise robustness is the removal of noise information through segregation by binary time-frequency masks; each time-frequency unit in a spectro-temporal representation of the speech signal is labeled either noise-dominant or signal-dominant. The noise-dominant units are masked and their e...

متن کامل

Frequency-domain parameter estimations for binary masked signals

We present an approach for the extraction of parameters of a damped complex exponential model from a spectrogram modified by a binary mask. The parameters are estimated by a frequency domain based methods using subspace techniques, where the core algorithm is F-ESPRIT. The sub-band defined by the binary mask provides a reduced number of DFT-samples for the parameter extractions, which results i...

متن کامل

A new noise-tracking algorithm for generalizing binary time-frequency (T-F) masking to ratio masking

In this paper, we attempt to generalize the ideal binary mask (IBM) estimation to the ideal ratio mask (IRM) estimation. Under binary masking, the error in IBM estimation may greatly distort the original speech spectrum. The main purpose of this paper is using ratio mask to smooth this negative impact. Since the key issue is the noise tracking, we firstly use exponential distributions to model ...

متن کامل

A Two Stage Mask Estimation Approach to Robust Speaker Verification

We propose a two-stage mask estimation approach to robust speaker verification (SV) in noise environments. We consider a practical semi-blind SV scenario: the location of the target speaker is fixed while the locations of all interferers are unknown. In the first stage, we use a dual-microphone and a semi-blind degenerate unmixing estimation technique (DUET) to estimate an initial binary mask. ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014